Endogenous mobile genetic elements
can give rise to de novo germline
or somatic mutations that can have
dramatic consequences for genome
regulation, both locally and possibly more
globally, depending on the site of integration.
However, if we consider them as "normal genetic"
components of the reference genome, then
they are likely to modify local chromatin
structure, which would have an effect on
gene regulation irrespective of their ability
to transpose further. As such they can be
treated as any other domain involved in
a gene x environment interaction. Similarly,
their evolutionary appearance in the
reference genome would supply a driver
for species-specific responses/traits. Our
recent data would suggest the homi- 
nid specific subset of retrotransposons, 
SINE-VNTR-Alu (SVA), can function 
as transcriptional regulatory domains 
both in vivo and in vitro when analyzed 
in reporter gene constructs. Of particular
interest in the SVA element are the
variable number tandem repeat (VNTR)
domains, which as their name suggests
can be polymorphic. We and others have
previously shown that VNTRs can be 
both differential regulators and biomark- 
ers of disease based on the genotype of 
the repeat. Here, we provide an overview 
of why polymorphism in the SVA ele- 
ments, in particular the VNTRs, could 
alter gene expression patterns that could 
be mechanistically associated with dif- 
ferent traits in evolution or disease pro- 
gression in humans. 



Repetitive DNA in the Genome 

In this article we will use the term
variable number tandem repeat (VNTR)
to encompass both microsatellites and
minisatellites; however, not all tandem
repeat domains are polymorphic in
repeat copy number. Nevertheless, there
are estimated to be approximately 1 million
VNTRs in the human genome.
Their function has been demonstrated to
include modulation of alternative splicing,
thus altering protein isoform levels,
and effects on gene regulation. It is the latter
of these functions that our group has
the most experience analyzing. We have
demonstrated in several genes involved
in monoaminergic transmission in the
nervous system that, based on copy number,
VNTRs can both be biomarkers of
predisposition to neuropsychiatric disorders
and direct differential tissue-specific
and stimulus-inducible expression both in
vitro and in vivo. We began to address
potential VNTRs in the limited num- 
ber of candidate genes associated with 
Amyotrophic Lateral Sclerosis (ALS) 
and Frontotemporal Lobar Degeneration 
(FTLD). We identified a large tandem
repeat domain approximately 10 kb 5' of
the major transcriptional start site of the
FUS (Fused in sarcoma) gene using the
UCSC genome bioinformatic site
(http://genome.ucsc.edu). Indeed, closer inspection
of the primary sequence showed the
potential for two adjacent VNTRs which 
varied between 37-50 bp in the size of 
their repeat sequence. When BLAT was
performed using this repeat sequence,
we identified a host of related sequences
all sharing significant homology. This
tandem repeat represented the central
domain of the hominid retrotransposon 
family termed SINE-VNTR-Alu (SVA). 



SVA Retrotransposons 

SVA elements represent only 0.13% of
the genome, approximately 2700 elements;
they are the youngest of the retrotransposable
elements in the human genome
and are hominid specific. They consist
of a hexamer repeat (CCCTCT), an Alu-like
sequence, a GC-rich VNTR, a SINE
and a poly A-tail. Many SVA elements
contain two central GC-rich VNTRs, and
it should be noted that the flanking hexamer
is also a VNTR. Generally SVA
elements vary in length from approximately
1000-4000 bp, with 63% of SVA
element insertions in the human genome
being full-length, containing all five
domains. SVA elements are divided
into subtypes (A-F) by the SINE region,
and more recently a seventh subtype was
identified containing a 5' transduction of
sequence from the MAST2 gene, referred
to as either CpG-SVA, MAST2 SVA or
SVA F1 element. Like other
actively retrotransposing elements, SVA
elements show inter-individual variation
in humans and can be polymorphic for
their absence or presence in the genome.
It has been estimated that 37.5% of SVA 
E elements and 27.6% of SVA F elements 
are polymorphic for their presence in the
genome and the average human is esti- 
mated to have ^^6 SVA absence/presence 
polymorphisms . 
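
Because the hexamer (CCCTCT) and the central VNTRs are scored by repeat copy number, a genotype call reduces to counting tandem copies of a repeat unit. The following is a minimal sketch of that counting step, assuming perfect (uninterrupted) repeats; the function name and the toy sequences are invented for illustration and are not taken from any real SVA locus.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back perfect copies of `unit`.

    Only exact, uninterrupted copies are counted; real VNTR units often
    contain variant copies, which this sketch deliberately ignores.
    """
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [
        len(match.group(0)) // len(unit)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Toy alleles (invented flanks, not real genomic sequence).
short_allele = "AAGT" + "CCCTCT" * 3 + "GGCA"
long_allele = "AAGT" + "CCCTCT" * 5 + "GGCA"

print(longest_tandem_run(short_allele, "CCCTCT"))
print(longest_tandem_run(long_allele, "CCCTCT"))
```

Two alleles that differ only in this count would be distinguished as length variants, which is the sense in which the hexamer and central repeats are treated as VNTRs here.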

As with other retrotransposons much 
effort has been expended on the ret- 
rotransposition event itself giving rise to 
disease. This has resulted in the identification
of at least 8 well characterized diseases
in which the retrotransposition event
has caused the disease, in the majority of
cases by affecting alternative splicing.
However, our hypothesis is that SVA elements
would have an effect on epigenetic
and transcriptional parameters at the locus 
in which they are found in the human 
reference genome without the need for 
retrotransposition. These parameters are 
embedded within the primary sequence 
of the SVA element and in particular the 
tandem repeat domains. 



SVA Elements as a Model 
for Gene x Environment 
(GxE) Interaction in Disease 
Progression and Evolution 

Our hypothesis is that the SVA element
could impart a hominid-specific
regulatory twist on regulation of the
gene in which it has inserted (assuming
that the insertion is not in an exon and does not
destroy intron/exon boundaries, etc.). In
the promoter they would act as classical
mediators of gene expression. Analysis
of Human Genome release 19 (hg19)
indicates that of the 2676 SVA elements
in the human genome, 433 are present
within 10 kb 5' of the main transcriptional
start site. We have published on
two such promoter domains, approximately
8 and 10 kb respectively from the
major transcriptional start site of the
nearest gene, namely the PARK7 and
FUS genes. The following points all support
a regulatory role for SVA elements
in the genome:

1. SVA elements are functional in conventional
reporter gene assays both in vitro
(cell line) and in vivo (chick embryo)

2. SVA elements are polymorphic, most
clearly via VNTR domains (flanking and
internal), although SNPs could also
exist.

a. We have previously shown in other 
genes (as have many others) that VNTRs 
are both polymorphic biomarkers associ- 
ated with disorders and transcriptional 
regulatory domains in vivo and in vitro 

b. VNTRs are common regulatory 
domains in viruses; 

i. They in part control HSV-1 virus 
latency in dorsal root ganglia (via a simi- 
lar CCCCTC repeat to that found in the 
flanking sequence of the SVA element) 

ii. VNTRs are present in the enhancers
of many viruses, including retroviruses
(refs. 26,27)

3. SVA elements can be associated with 
active chromatin adjacent to: 

a. Positive Affymetrix probe arrays in
the human CNS 

b. ENCODE active histone marks 

4. SVA elements have the potential
to form structures which may be strong
modulators of genome structure/function:

a. Potential G4 quadruplex 



b. Strong CpG component approaching
the classical requirements to be
defined as CpG islands, which is consistent
with their differential methylation in
tumors
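
The two sequence properties in point 4 can be screened for computationally. The sketch below assumes commonly quoted thresholds that are not stated in the text: a CpG island is taken as at least 200 bp with GC content above 50% and an observed/expected CpG ratio above 0.6, and a potential G-quadruplex is taken as four runs of three or more G separated by loops of 1-7 nt. All function names are hypothetical, and only the strand supplied is scanned, so a C-rich repeat would need its reverse complement checked separately.

```python
import re

# Assumed G-quadruplex pattern: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def cpg_stats(seq: str) -> tuple:
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count, g_count = seq.count("C"), seq.count("G")
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


def meets_cpg_island_criteria(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC-content and obs/exp CpG thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


def has_g4_motif(seq: str) -> bool:
    """True if the given strand contains the assumed G-quadruplex pattern."""
    return G4_MOTIF.search(seq.upper()) is not None
```

A sequence that passes the GC and obs/exp thresholds but falls short of the length threshold would be "approaching" the criteria in the sense used above.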



How do Hominid Specific SVA 
Regulatory Domains Utilize the 
Existing Transcription Machinery 
in the Cell?

In our analysis of the FUS SVA element
we generated reporter gene constructs that
were analyzed in both the neuroblastoma
cell line SK-N-AS and a chicken embryo
model. In the former, the data were
consistent with our previous data on the
PARK7 SVA element, in which we were
able to demonstrate that the SVA element
contained multiple regulatory domains.
This would also be consistent with previous
data demonstrating promoter function
with the SVA F1 element. This
regulatory analysis was extended with the 
FUS SVA element to demonstrate that 
this domain supported expression in vivo 
using the chicken embryo model. This is 
also an example of hominid-specific regulatory
domains functioning in other species.
Previously, we demonstrated that
the human serotonin transporter VNTR,
which is not present in rodents, nonetheless
supported both tissue-specific and differential
reporter gene expression in mouse
embryos, which was directed by genotype
of the VNTR. In this model the expression
in the embryo was in the cells which
at that point in development in the rodent 
first begin to develop a serotoninergic lin- 
eage with serotonin transporter (SLC6A4) 
expression. We hypothesized that the 
observed reporter expression would result 
from the novel regulatory VNTR which 
had evolved taking advantage of the exist- 
ing transcriptional machinery in particu- 
lar subsets of cells that were important 
for serotonin transporter expression and 
that the VNTR had been maintained
in the genome because it conferred a selective
advantage in evolution. We propose
that a similar mechanism would operate 
at the VNTRs in the SVA element and 
in part this would lead to novel regula- 
tory responses via the SVA element at the 
adjacent endogenous gene. It should also 
be noted that many VNTRs, such as the
SLC6A4 VNTR in the mouse transgenic
model above, are intronic, and therefore
the location from which an SVA element can exert
such regulatory properties can be quite
variable at a gene locus.

The distance of the SVA element from 
its site of action on a specific promoter or 
transcriptional start site can be quite large. 
It has been demonstrated by us and others,
when attempting to reproduce endogenous
gene expression in transgenic models, that
regulatory domains utilized by a locus can
be 100 kb or more away from the gene itself. Our
own studies on the endogenous expression
of the neuropeptide substance P,
encoded by the TAC1 gene, demonstrated
that although the gene itself from 5' to 3'
UTR was 8 kb, we required a DNA fragment
of 350 kb to reproduce the appropriate
expression patterns. Subsequently
our collaborators demonstrated that key regulatory
domains were as far as 250 kb and
100 kb 5' of the TAC1 gene. The 350 kb
TAC1 fragment as a transgenic insertion
demonstrated not only the well characterized
rodent expression of TAC1 but, as
was the case with the SLC6A4 VNTR,
also human-specific TAC1
expression patterns, and we again argued
that the DNA was taking advantage of the
complement of transcription factors that
was present in the cells when the DNA in
the regulatory regions evolved to support
distinct transcription profiles for this gene.
By the same mechanism we would argue
that SVA elements even 100 kb away from
the transcription start site could modulate
transcriptional regulation distinct from an
allele without an SVA element.
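
Statements such as "within 10 kb 5' of the transcriptional start site" or "100 kb away" amount to a strand-aware distance between an element and a TSS. The sketch below shows one way such a window query could be expressed; it is not the procedure used for the hg19 analysis, the class and function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and each gene is reduced to a single major TSS.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int     # position of the major transcriptional start site
    strand: str  # "+" or "-"


@dataclass(frozen=True)
class Element:
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def upstream_distance(gene: Gene, element: Element) -> Optional[int]:
    """Distance in bp from the TSS to the nearest edge of an element that
    lies entirely 5' of it; None if the element is elsewhere."""
    if gene.chrom != element.chrom:
        return None
    if gene.strand == "+":
        distance = gene.tss - element.end    # 5' means lower coordinates
    else:
        distance = element.start - gene.tss  # 5' means higher coordinates
    return distance if distance > 0 else None


def count_upstream(genes: Iterable[Gene], elements: Iterable[Element],
                   window: int = 10_000) -> int:
    """Count elements lying within `window` bp 5' of at least one TSS."""
    genes = list(genes)
    count = 0
    for element in elements:
        distances = (upstream_distance(g, element) for g in genes)
        if any(d is not None and d <= window for d in distances):
            count += 1
    return count
```

Widening `window` from 10 kb to 100 kb or more is the computational analogue of the argument above that distal SVA elements should not be excluded as candidate regulators.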

The recent evolution of SVA elements 
has imposed restricted sequence variation 
in each of the SVA subclasses; this may 
allow a concerted response to challenge by 
groups of SVA elements with similar pri- 
mary sequence based on sequence specific 
DNA binding proteins. In this manner it 
might be possible for SVA elements con- 
taining similar primary sequence such as 
is present in the hexamer repeat or central
VNTRs to bind similar transcription
factors based on shared consensus
DNA binding domains. Thus alterations
in active transcription factor complement 
in the cell might modulate several SVA 
elements by the same signal transduction 
pathways. 

Can SVA Elements be 
Responsible for Emerging 
Diseases? 

The frequency of SVA element retrotransposition
is estimated to be 1:916
births, which would correspond to 7 x 10^6
distinct insertions worldwide in the population.
These could result in modulation
of transcriptional properties without mod- 
ifying protein structure or splicing param- 
eters. Could these constitute a proportion 
of the rare event genetic components that 
underlie disease processes? We can argue
that SVA elements should be assigned
a greater role in disease, independent of the
retrotransposition event, and that, due in part
to the nature of the VNTR elements, they
could be biomarkers of predisposition to
a specific disorder. In our analysis of SVA
elements located close to the FUS and PARK7
genes, we have addressed the polymorphic 
variation in the VNTRs to establish the 
range of VNTR repeat number in these 
domains. We found in both of these genes
that only one of the central tandem repeats
was polymorphic, while the other had a
fixed repeat length. We also confirmed
in PARK7 that the hexamer was a repeat,
and 3 length variants were observed in the
HapMap CEU cohort. For the FUS analysis
we genotyped the polymorphism
in sporadic ALS cases vs. controls (241 and 228,
respectively). The FUS SVA element does
not contain a flanking hexamer so we 
analyzed only the central repeat domain 
and found 2 variants. The data implied
a difference between the two cohorts in the
frequency of homozygous short vs.
heterozygous long/short SVA element
allele carriers, which may indicate an association
with sporadic ALS. This
suggests that in a larger cohort we might
reach significance for these genotypes.
A technical problem of such analysis is
that PCR over large repeat units is not
simple and each PCR requires significant
optimisation; the requirement
to do this PCR for large cohorts
might therefore prove a hindrance to such studies.
However, it should be possible to find tag- 
ging SNPs for SVA elements which dem- 
onstrate limited polymorphism as in the 
FUS SVA element. We have done this by
converting the long and short SVA element
genotypes of individuals from
the CEU cohort to "SNPs" so they could
be uploaded into the Haploview software
to identify tagging SNPs. These can be used
to address correlation with disease in
larger cohorts in the future, and the same
approach could generate tagging SNPs for
other SVA elements when addressing association
with disease. This is of course dependent on a
restricted number of SVA "VNTR" alleles 
in a specifically targeted SVA element. In 
the limited number of SVA elements we 
have addressed, including PARK7 and 
FUS, we have evidence for two major 
VNTR alleles in the majority with none 
exhibiting greater variation than PARK7; 
however the analysis is far from complete. 
Nevertheless such a strategy may allow for 
integration of SVA element genotype as a 
correlate of human disease more rapidly 
than PCR analysis of 1000s of individual 
SVA elements. 
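
The recoding of long/short SVA alleles as a pseudo-SNP can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the allele codes (S = 1, L = 2), the single-marker linkage-style row layout, and the function names are all hypothetical, and the exact input format should be checked against the Haploview documentation. The correlation helper uses the squared Pearson correlation of allele dosages as a simple stand-in for the linkage disequilibrium measure a tagging program would compute.

```python
# Arbitrary, assumed coding of the two VNTR length alleles as SNP alleles.
ALLELE_CODE = {"S": "1", "L": "2"}


def to_ped_line(sample_id: str, genotype: str, sex: int, affection: int) -> str:
    """Format one unrelated individual as a linkage-style row with one marker.

    Columns: family ID, individual ID, father, mother, sex
    (1 = male, 2 = female), affection status (0 = unknown,
    1 = unaffected, 2 = affected), then the two allele codes.
    """
    allele_1, allele_2 = sorted(ALLELE_CODE[a] for a in genotype)
    return f"{sample_id} {sample_id} 0 0 {sex} {affection} {allele_1} {allele_2}"


def dosage(genotype: str, counted: str = "L") -> int:
    """Number of copies (0, 1 or 2) of the counted allele in a genotype."""
    return genotype.count(counted)


def r_squared(x, y) -> float:
    """Squared Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        return 0.0  # a monomorphic marker cannot tag anything
    return (s_xy * s_xy) / (s_xx * s_yy)
```

A flanking SNP whose dosages give a value close to 1 against the SVA dosages would be a candidate tag. The approach only works when the SVA VNTR has two major alleles, since a biallelic marker cannot represent three or more length variants.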

Concluding Remarks 

Our recent work suggests that SVA elements
can in part modulate genome function
via their action as transcriptional
or epigenetic modulators. Polymorphism
in the VNTRs within these retrotranspo- 
sons demonstrates a mechanism not only 
for differential regulatory function associ- 
ated with genotype but a way to rapidly 
integrate polymorphism in the SVA ele- 
ment as a potential biomarker for disease 
association without the requirement for 
retrotransposition. 